Parameter Space Noise for Exploration

Authors

  • Matthias Plappert
  • Rein Houthooft
  • Prafulla Dhariwal
  • Szymon Sidor
  • Richard Y. Chen
  • Xi Chen
  • Tamim Asfour
  • Pieter Abbeel
  • Marcin Andrychowicz
Abstract

Deep reinforcement learning (RL) methods generally engage in exploratory behavior through noise injection in the action space. An alternative is to add noise directly to the agent's parameters, which can lead to more consistent exploration and a richer set of behaviors. Methods such as evolutionary strategies use parameter perturbations, but discard all temporal structure in the process and require significantly more samples. Combining parameter noise with traditional RL methods makes it possible to get the best of both worlds. We demonstrate that both off- and on-policy methods benefit from this approach through experimental comparison of DQN, DDPG, and TRPO on high-dimensional discrete action environments as well as continuous control tasks. Our results show that RL with parameter noise learns more efficiently than traditional RL with action space noise and evolutionary strategies individually.
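The contrast the abstract draws between action space noise and parameter space noise can be illustrated with a minimal sketch. It is not the authors' implementation: the toy linear `policy`, the helper names `rollout_action_noise` and `rollout_parameter_noise`, the fixed noise scale `sigma`, and the choice to sample the parameter perturbation once per episode are all assumptions made for illustration.

```python
import numpy as np


def policy(theta, state):
    """Toy deterministic policy (assumed): action = tanh(theta @ state)."""
    return np.tanh(theta @ state)


def rollout_action_noise(theta, states, sigma, rng):
    # Fresh Gaussian noise is added to every action, so revisiting the
    # same state within one episode can yield a different action each time.
    n_actions = theta.shape[0]
    return np.array(
        [policy(theta, s) + sigma * rng.standard_normal(n_actions) for s in states]
    )


def rollout_parameter_noise(theta, states, sigma, rng):
    # The parameters are perturbed once (assumed: at the start of the
    # episode). The perturbed policy is deterministic, so the same state
    # always maps to the same action for the rest of the episode.
    theta_perturbed = theta + sigma * rng.standard_normal(theta.shape)
    return np.array([policy(theta_perturbed, s) for s in states])


rng = np.random.default_rng(0)
theta = rng.standard_normal((2, 3))      # 2 action dimensions, 3 state dimensions
state = np.array([0.5, -1.0, 0.25])
states = [state, state, state]           # the same state visited three times

actions_action_noise = rollout_action_noise(theta, states, 0.1, rng)
actions_param_noise = rollout_parameter_noise(theta, states, 0.1, rng)

print("action noise:\n", actions_action_noise)
print("parameter noise:\n", actions_param_noise)
```

In this sketch the three rows printed for parameter noise are identical, while the rows for action noise differ, which is one way to read the abstract's phrase "more consistent exploration".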

Similar resources

Exploring parameter space in reinforcement learning

This paper discusses parameter-based exploration methods for reinforcement learning. Parameter-based methods perturb parameters of a general function approximator directly, rather than adding noise to the resulting actions. Parameter-based exploration unifies reinforcement learning and black-box optimization, and has several advantages over action perturbation. We review two recent parameter-ex...

Online State Space Model Parameter Estimation in Synchronous Machines

The purpose of this paper is to present a new approach based on the Least Squares Error method for estimating the unknown parameters of the nonlinear 3rd order synchronous generator model. The proposed method uses the mathematical relationships between the machine parameters and on-line input/output measurements to estimate the parameters of the nonlinear state space model. The field voltage is...

Application of Single-Frequency Time-Space Filtering Technique for Seismic Ground Roll and Random Noise Attenuation

Time-frequency filtering is an acceptable technique for attenuating noise in 2-D (time-space) and 3-D (time-space-space) reflection seismic data. The common approach is to transform each seismic signal from the 1-D time domain to a 2-D time-frequency domain, denoise the signal with a designed filter, and finally transform the filtered signal back to the original time domain. ...

Optimal aeroacoustic shape design using the surrogate management framework

Shape optimization is applied to time-dependent trailing-edge flow in order to minimize aerodynamic noise. Optimization is performed using the surrogate management framework (SMF), a non-gradient based pattern search method chosen for its efficiency and rigorous convergence properties. Using SMF, design space exploration is performed not with the expensive actual function but with an inexpensiv...

WIZER: What-If Analyzer for Automated Social Model Space Exploration and Validation

Complex social problems modeled by multi-agent systems have very large parameter and model space. The problem of how to model, validate, detect, and plan for the event of bioterrorism is one of these, as it requires faithful modeling of dynamic signal (bioattack event) from complex dynamic noise (normal disease outbreaks and ...

Journal title:
  • CoRR

Volume abs/1706.01905

Publication date 2017